Summary

Force  reductions,  high  operational  tempos,  and  consolidations  of  career  fields  are 
prompting  the  United  States  Air  Force  to  seek  more  efficient  and  effective  ways  to  train 
the  force  in  order  to  assure  mission  readiness.  One  innovative  way  to  deliver  technical 
training  would  be  the  use  of  animated  pedagogical  agents  within  interactive  learning 
environments.  This  paper  looks  at  recent  applications,  empirical  studies,  and  technical 
challenges  in  using  such  agents  for  training  and  lays  out  a  research  plan  to  acquire 
empirical  knowledge  of  the  technology’s  effectiveness  and  limitations. 

Background 

Over  the  last  40  years,  Air  Force  aircraft  maintenance  career  fields  have 
undergone  numerous  consolidations,  generalizations,  and  force  reductions.  For  example, 
the  current  “Avionics”  Air  Force  Specialty  Code  (AFSC)  was  once  seven  separate  career 
fields  that  included  Navigation,  Communications,  Electronic  Counter  Measures  (ECM), 
Inertial  Navigation,  Weapons  Control,  Instruments,  and  Flight  Control.  These  career 
fields were non-weapon specific but highly specialized, requiring 42 to 50 weeks of
schoolhouse  training.  As  the  AFSC  mergers  took  place,  the  resulting  combined  AFSC 
career  fields  became  less  technically  specialized  and  focused  on  specific  weapon  systems. 
Today,  the  seven  career  fields  are  combined  into  a  single  Avionics  AFSC  requiring  only 
23 weeks of schoolhouse training (Botello, Jernigan, Stimson, Marquardt, Kancler,
Curtis, Barthol, Burneka, & Whited, 2006). During this same time period, the age of
weapon  systems  being  maintained  has  increased,  spurring  more  frequent  failures  as  well 
as different modes of failure (e.g., intermittent failures due to cracking wiring harnesses).
Operational  tempo  has  also  increased  as  the  Air  Force  has  embarked  on  a  series  of 
operations over the last 17 years starting with Desert Shield/Desert Storm and continuing
on  with  the  Global  War  on  Terror  (GWOT)  along  with  various  humanitarian  missions 
(Moseley,  2007).  The  bottom  line  can  be  summarized  as  follows:  Aircraft  maintenance 
technicians today receive less training, are less specialized, and require more on-the-job
training at their first assignment. The force reductions and high operational tempos
mentioned above aggravate this state of affairs through expert attrition and heavy daily
workloads (Botello et al., 2006).

The consolidation of career fields, which shifts maintenance technicians from
specialists to generalists, falls in line with today’s Air Force goal of a lighter, leaner, more agile
force  that  can  respond  to  contingencies  anywhere  and  anytime.  However,  to  assure 
mission  success,  more  efficient  and  effective  ways  to  train  technicians  of  all  career  fields 
are  needed.  The  emphasis  here  is  on  using  innovative  technologies  that  can  help  in 
reducing  the  cost  and  time  of  training  while  increasing  the  effectiveness  of  training. 

Extensive  literature  reviews  have  illuminated  an  inherent  need  for  job  training 
aids  to  provide  interactive  detailed  instructions  on  performing  complex  tasks  that  are 
either  new  to  the  trainee  or  infrequent  enough  to  prevent  user  proficiency.  The  training 
aid  must  possess  universal  access  to  relevant  training  materials  whenever  and  wherever 
needed.  Today,  training  is  accomplished  through  schoolhouse  coursework  coupled  with 
live  instructors  demonstrating  proper  procedures  and  then  monitoring  progress  of  the 
student  in  accomplishing  the  lesson.  With  the  emergence  of  virtual  reality  and  computer 
graphics  systems,  it  is  possible  that  some  of  this  live  training  may  be  augmented  by  a 
virtual (computer-generated) human maintainer, also known as an avatar, who performs or
guides the required tasks. The literature refers to these avatars and the virtual worlds they
inhabit as animated pedagogical agents and interactive learning environments,
respectively. Avatars are used extensively in computer gaming for both off-line and on-line
games. This gaming environment may offer a significant advantage over
traditional forms and methods of training for the “digital natives” (Prensky, 2001a), a
term used to describe the generation of people who grew up with computers and who are
entering the Air Force today. When training mimics the environment used in games for
entertainment, students may become more engaged and enthusiastic to learn.

Applications 

The  Army  Research  Development  and  Engineering  Command  in  Orlando, 
Florida,  and  Forterra  Inc.  in  San  Mateo,  California,  have  developed  a  multiplayer  online 
role-playing game (called Asymmetric Warfare Virtual Training Technology, or AW-VTT)
which will expose soldiers to people of different cultures. The goal of the game’s
development  is  to  make  soldiers  think  critically  about  their  surroundings  and  enhance 
their  situational  awareness.  The  game  is  set  up  to  continuously  operate  24  hours  a  day 
and  allow  participants  to  venture  in  and  out  of  scenarios  as  they  choose.  A  highlight  of 
the  system  is  that  both  deployed  and  soon-to-be  deployed  soldiers  can  play  the  game 
simultaneously,  allowing  deployed  soldiers  to  impart  their  knowledge  to  those  without 
the  experience.  A  few  of  the  roles  which  participants  can  play  include  civilians, 
insurgents,  and  national  guardsmen.  A  soldier’s  avatar  can  also  take  on  his/her  physical 
characteristics.  A  soldier’s  physical  qualification  scores  can  be  fed  into  the  game 
enabling  the  avatar  to  have  the  same  physical  limitations/strengths  as  the  person  in  real 
life.  Although  half  complete,  AW-VTT  was  advanced  enough  to  be  played  with  the 
Illinois National Guard Artillery Battalion. The group had no deployment experience and
the AW-VTT exposed them to cultural differences and possible hostile occurrences at
simulated  checkpoint  operations  (Peck,  2005). 

The Army has also worked with Will Interactive to develop digital game-based
learning  systems  to  encourage  individuals  to  think  as  teams.  The  Army  saw  that  teams 
were  not  pulling  together  in  crisis  situations  and  gaming  might  be  a  solution.  The 
company  developed  Saving  Sergeant  Pabletti  which  is  used  by  over  80,000  soldiers  a 
year  to  practice  team  skills.  Drill  sergeants  can  use  the  interactive  game  with  up  to  300 
persons  at  a  time.  In  addition  to  team  skills,  other  training  goals  were  included  such  as 
sexual harassment prevention, Army values, equal opportunity, and cross-cultural
communication. With the use of this digital learning system, the Army saw training
time reduced from 15 hours to 4 hours (Prensky, 2001b).

The  Navy,  as  well  as  the  Army,  conducts  as  much  training  as  possible  through 
simulated  means.  In  1991,  the  Chief  of  Navy  Technical  Training  asked  Dr.  Henry  Halff,  a 
research  psychologist,  to  develop  a  computer  game  for  basic  electricity  and  electronics 
training for avionics technicians. Dr. Halff knew traditional education was limited in
the opportunities it provided to practice newly acquired skills and in sustaining
motivation. From this, he thought adventure gaming might be an effective mode of
instruction. Halff and his team created the Electro Adventure game. The program
combined adventure gaming and traditional computer-based training elements to allow
the participants to discover technical and safety problems in a ship’s compartment in a
given scenario (Prensky, 2001b).

The  Air  Force  has  also  looked  at  computer  gaming  through  its  development  of 
Falcon  4.0,  a  commercially  developed  game.  The  game  is  a  low  cost  simulator  that 
allows pilots to practice their skills during deployments. This enables the Air Force pilots
to maintain their proficiency. Even the Marines are looking at adapting commercially
available games for training. The commercial game, Doom, has been used to teach
teamwork, communication, and command and control concepts rather than just shooting
and killing. It is played as a networked game where four-member teams, each with a
computer, play through a combat training scenario. Sound effects in the game and
participant communication add to the players’ confusion and disorder. Players enhance their
skills of strategy and tactics as they advance through the game and destroy the enemy
(Prensky, 2001b).

Bromwich (2007) mentions that one of the first corporate dealings with avatars
included  Clippy,  the  animated  paperclip  of  Microsoft  Office.  This  avatar,  for  the  most 
part,  was  an  annoying  office  assistant  which  would  appear  in  the  bottom  corner  of  one’s 
computer  screen  and  offer  help  in  word  processing.  Since  then,  avatars  have  come  a  long 
way  in  the  workplace.  CDW  Corporation,  a  technology  products  and  services  company, 
sought  development  of  a  sales  training  course  for  their  employees  through  Accenture  Ltd. 
In  the  course,  avatars  were  to  take  on  the  persona  of  coaches  and  guide  the  trainees 
through mock interactions with varied-response customers. Bromwich reports from Bruce
Damer, author of Avatars! Exploring and Building Virtual Worlds on the Internet, two
examples  of  avatar  technology  use  in  industry.  Boeing,  he  points  out,  has  used  avatars  in 
a  virtual  Super-jumbo  Plant  to  resolve  issues  with  the  giant  Airbus.  The  avatars  can  take 
on  an  insect  view  of  the  plane  and  see  fine  details.  Once  a  problem  is  discovered,  the 
avatars  disappear  and  the  experience  is  recorded  for  future  analysis.  In  addition,  EDS 
used a virtual avatar to “conduct weekly classes for EDS employees worldwide” which
aided  the  company  in  “corporate  problem  solving  and  training”  (Bromwich,  2007). 

Avatars  in  virtual  worlds  have  also  been  applied  to  medical  education  with 
possibilities  in  individual  learning,  team  training,  and  complex  team  training.  The 
SUMMIT  development  team  at  Stanford  University  has  designed  a  web-based  3D  virtual 
world  for  medical  purposes  in  chemical,  biological,  radiological/nuclear,  and  explosive 
(CBRNE)  incidents.  Multiple  players  access  the  world  for  training  with  an  emphasis  on 
interactions between team members. The virtual world is not well suited to procedural
skills (such as inserting a needle), but it is suited to participant training in leadership,
communication, and teamwork. In this simulation-based learning, individual learners
control avatars in a 3D world where a CBRNE event has occurred. Participants can also
be dispersed. Learners interact with the simulated victims, as well as one another, while
their instructor oversees the exercises. The training is usually followed up with an
after-action review or debriefing with the instructor and participants to review items that
worked  well  and  items  which  could  have  been  performed  differently  (Dev,  Youngblood, 
Heinrichs,  &  Kusumoto,  2007). 

Empirical  Studies 

Since  the  application  of  avatars  in  training  environments  is  still  relatively  new, 
very  little  empirical  knowledge  exists  as  to  their  effectiveness  in  these  environments. 
While  technology  is  getting  better  at  providing  more  realistic  representations  of  human 
characteristics  and  enabling  more  natural  communication  between  the  student  and  avatar, 
we  still  do  not  know  how  effective  the  technology  is  for  achieving  training  goals  or 
where it would be best to apply the technology. Johnson, Rickel, and Lester (2000) report
on  two  empirical  studies  of  animated  pedagogical  agents. 

The  first  study  was  conducted  on  the  Herman  the  Bug  agent  that  inhabits  the 
Design-A-Plant  learning  environment.  Design-A-Plant  is  a  design-centered  learning 
environment  where  students  explore  the  physiological  and  environmental  factors  that 
determine  the  survival  of  a  particular  plant  design.  The  purpose  of  this  experiment  was  to 
obtain  a  baseline  of  the  potential  effectiveness  of  animated  pedagogical  agents  on 
problem  solving  and  study  the  effects  of  different  levels  of  interaction  employed  by  the 
animated  pedagogical  agents.  The  study  involved  one  hundred  middle  school  students, 
each  interacting  with  one  of  five  versions  of  the  Herman  the  Bug  agent.  The  versions 
differed  based  on  modes  of  expression  and  the  level  of  advice  they  offered  with  respect  to 
the  students’  problem  solving  activities.  The  five  versions  are  listed  as  follows: 

• Muted: This version provided no advice at all and was used as an experimental
control.

• Task-Specific Verbal: This version provided only task-specific (low-level)
verbal advice.

• Principle-Based Verbal: This version provided only principle-based (high-level)
verbal advice.

• Principle-Based Animated/Verbal: This version provided principle-based
advice using both animated and verbal responses.

• Fully Expressive: This version provided both principle-based and task-specific
advice using both animated and verbal responses.

The learning environment captured all interactions during the problem solving
activities.  The  students  were  given  a  pre-test  and  post-test  to  measure  baseline  knowledge 
and  improvement.  The  results  were  as  follows: 

•  Students  interacting  with  animated  pedagogical  agents  showed  statistically 
significant  increases  in  knowledge  from  pre-test  to  post-test,  thus  establishing  that 
a  well  designed  virtual  learning  environment  can  successfully  transfer  knowledge. 

• Animated pedagogical agents that employed multiple levels of advice and
multiple modes of providing that advice yielded greater improvements on
post-tests than less expressive agents, demonstrating higher benefits in using
multimodal agents that provide both theoretical and practical advice.

•  Positive  effects  of  animated  pedagogical  agents  were  more  pronounced  when 
students  dealt  with  more  complex  problems,  indicating  the  potential  effectiveness 
of  agents  in  assisting  students  in  solving  complex  technical  problems  (Lester, 
Converse,  Stone,  Kahler,  &  Barlow,  1997). 

In the second study cited, the German Research Center for Artificial
Intelligence conducted an experiment to evaluate the effectiveness of the
Personalized Plan-Based Presenter (PPP) agent in facilitating learning. Two versions of
a  learning  environment  were  created.  One  used  the  PPP  agent  to  point  out  specific  areas 
of  interest.  The  other  did  not  use  the  PPP  agent,  but  instead  used  simple  arrows  to 
identify  specific  items.  Both  used  the  exact  same  narration.  The  test  subject  pool 
consisted of 30 adult participants (15 female and 15 male) recruited from Saarbrücken
University. The test subjects were asked to view two types of presentations. One type was
a technical description of different pulley systems and the other was a non-technical
description of fictitious office employees (Johnson et al., 2000). Unlike the Design-A-Plant
experiment, the subjects did not have to do any problem solving. The learning
effect  was  measured  through  questions  of  comprehension  and  recall  following  the 
presentations.  Qualitative  data  was  also  collected  through  a  questionnaire  at  the  end  of 
the  experiment. 

The results of this study showed no differences in the test
subjects’ comprehension of the information presented based on the presence of the PPP
agent. However, the qualitative data gathered indicated that most of the test subjects
preferred the presentations with the PPP agent. Also, most subjects stated that the
technical presentations were easier to understand and more entertaining with the PPP
agent (André, Rist, & Müller, 1999).

There  have  been  other  studies  looking  at  the  effects  of  animated  pedagogical 
agents  where  learning  was  affected.  Atkinson  (2002)  performed  two  experiments  which 
looked  at  incorporating  pedagogical  agents  in  a  computer-based  environment  to  teach 
learners  to  solve  word  problems.  In  the  first  experiment,  two  questions  were  asked: 

•  Are  examples  coupled  with  pedagogical  agents  more  effective  at  promoting 
learning  than  examples  without  the  agent  present? 

• Are examples containing aural or textual explanations (both paired with an agent)
more effective at promoting learning?

To answer these questions, the author designed a study with 50 undergraduate students
from the Educational Psychology and Psychology Departments at Mississippi State
University. Students were randomly assigned to one of five conditions: voice-plus-agent,
text-plus-agent, voice only, text only, or control (where no instructional
explanation or agent was present). Participants completed two sessions. A pre-test
and post-test were used to measure information transfer and a five-point Likert scale
questionnaire was used to measure reported example difficulty. Voice conditions
(voice-plus-agent and voice only) were found to have a significant effect on reported example
difficulty. Participants exposed to the voice conditions reported less difficulty in solving word
problems than those presented with text-based explanations. Those presented with voice
also outperformed those exposed to text explanations in transfer scores (Atkinson, 2002).

From  the  first  experiment,  the  author  learned  that  low  power  resulting  from  the 
number  of  participants  in  the  experiment  negatively  affected  the  statistics.  Therefore,  for 
the  second  experiment,  the  two  sessions  were  collapsed  into  one  session.  Also,  the 
reliability  of  the  post-test  was  improved  by  adding  more  items  and  two  conditions  were 
removed  leaving:  voice-plus-agent,  voice  only,  and  text  only.  Here,  the  following 
questions  were  asked: 

•  Are  examples  paired  with  visual  and  auditory  presence  of  agents  more  effective  at 
promoting  learning  than  in  text  only  examples? 

• Are examples paired with visual and auditory presence of agents more effective at
promoting learning than in voice only examples?

Seventy-five undergraduate students from the same university participated in the
second experiment and were randomly assigned to one of the three conditions. The
same measures were carried over to this experiment. The three conditions differed
statistically from one another on reported example difficulty. The participants in the
voice-plus-agent condition outperformed those in the text only condition. Also, on information
transfer, students in the voice-plus-agent condition outperformed those in both the voice
only condition and the text only condition. Both experiments showed that
pedagogical agents presented along with aural capabilities can help optimize student
learning (Atkinson, 2002).

A  study  investigating  computer-based  learning  for  students  was  conducted  via 
two  experiments.  The  major  difference  in  the  two  experiments  was  the  pool  of  subjects 
used  (the  first  used  college  students  whereas  the  second  used  seventh  grade  students). 
Students  were  taught  how  to  design  the  roots,  stems,  and  leaves  of  plants  to  mature  in 
various  environments  (Moreno,  Mayer,  Spires,  Hiller,  &  Lester,  2001). 

The first experiment was to determine whether pedagogical agents in a social agency
environment helped students’ deep learning more than information
presented as on-screen text. Deep learning was defined as a student’s
ability to rely on previous knowledge of a subject matter to create mental models which
would aid in problem solving. From a psychology subject pool at the University of
California, Santa Barbara, 44 students participated. Subjects were randomly assigned to
one of two conditions: PA (aural pedagogical agent) or No PA (text only information).
Retention,  transfer,  and  interest  were  measured  through  post  retention  tests,  post 
problem-solving tests, and ten-point rating scales, respectively. For retention, students
learning with the aid of a PA did not statistically differ in their retention from those in the
No PA condition. On transfer, students in the PA condition produced
significantly more correct answers than those in the No PA condition. Those introduced
to the PA also produced significantly more correct answers on difficult transfer problems
than the No PA group. On interest, the PA group rated their interest in the material
significantly higher than did the No PA group.

Also, students in the PA group had significantly stronger interest in continuing
instruction in their program than the No PA group. The first experiment was replicated
with a younger set of seventh-grade students in a second experiment, with consistent results
across both student populations. That is, the presence of a pedagogical agent positively
affects students’ deep learning of information. In addition, students reported more interest
in  a  social  agency  environment  (one  with  the  PA)  rather  than  the  text  only  presentation 
environment  (Moreno  et  al.,  2001). 

Finally,  another  study  looked  at  an  agent,  AutoTutor,  and  measured  its  effects  on 
deep  learning  and  shallow  learning  (Graesser,  Moreno,  Marineau,  Adcock,  Olney,  & 
Person,  2003).  The  authors  defined  deep  learning  as  the  understanding  of  consequences 
of  events  and  deriving  methods  for  problem  solving,  whereas  shallow  learning  consisted 
of basic information in a field relating to definitions, properties, and examples of
technical components. The agent uses natural language to communicate with students. In
this experiment, the authors sought to find whether the dialog presentation of information or
the media presentation of information was what facilitated learning. Eighty-one students
participated  in  a  study  to  become  computer  literate  (hardware,  operating  systems,  and 
internet). AutoTutor was presented in one of four conditions: print (text) only,
speech only, agent with speech, and agent with speech and text.
Pre-tests and post-tests were given to assess the students’
deep and shallow knowledge. Results showed significant effects of training on deep
scores rather than shallow scores where the AutoTutor agent was present. All versions of
AutoTutor were superior in improving deep learning versus instances without the agent
present.

Technical Challenges

As  with  any  software  environment,  there  are  technical  issues  associated  with 
intelligent  tutoring  systems  as  well  as  interactive  learning  environments.  Some  of  the 
major  issues  revolve  around  representing  and  reasoning  about  domain  knowledge, 
modeling and adapting to the user’s knowledge, choosing appropriate pedagogical
strategies,  and  maintaining  a  coherent  and  understandable  dialogue. 

Virtual reality (VR) is a user interface paradigm in which the user feels immersed
in a computer-generated space, whereas a virtual world describes virtual reality systems
that allow multiple users to interact in the same space. The virtual world is a
three-dimensional representation of an environment in which the user operates. Because
multiple clients are networked together, there are several technical problems associated with
the databases required to support a multi-user VR system. Applications currently being explored by
commercial industry include research at Microsoft. This research has explored an
object-oriented multi-user world in which a network database server stores objects having
properties and methods. The topology of the space is defined by room objects, which represent
discrete locations and are interconnected by portal objects. Each room has a descriptive
text, which users read to situate themselves in the location. Also, objects can represent
things located in a room, and objects called “players” or “avatars” represent a user’s
character in the virtual world (Vellon, Marple, Mitchell, & Drucker, 1997).

Programming  behavior  in  virtual  worlds  is  accomplished  by  defining  methods  on 
objects  in  the  environment.  In  the  basic  object  model,  a  few  basic  objects  are  provided 
such  as  “Rooms,”  “Portals,”  “Artifacts,”  and  “Avatars.”  An  Avatar  has  a  variety  of 
properties  and  methods  to  specify  the  object  representing  the  user  in  the  world.  Some  of 
these  properties  include  the  gender  of  the  avatar,  a  list  of  friends,  the  room,  and  user 
information (Vellon et al., 1997).
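
The basic object model can be illustrated with a minimal Python sketch. The class names follow the basic objects named in the text; the field names, the connect, enter, and go methods, and the sample rooms are assumptions made for illustration and are not taken from Vellon et al. (1997), whose system keeps such objects on a network database server rather than in local memory.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class WorldObject:
    """Base object: every entity in the world has a name and a description."""
    name: str
    description: str = ""


@dataclass(eq=False)
class Artifact(WorldObject):
    """A thing located in a room."""


@dataclass(eq=False)
class Portal(WorldObject):
    """A one-way link that leads to a destination room."""
    destination: "Room | None" = None


@dataclass(eq=False)
class Room(WorldObject):
    """A discrete location; its portals define the topology of the space."""
    portals: dict = field(default_factory=dict)   # direction -> Portal
    contents: list = field(default_factory=list)  # artifacts and avatars present

    def connect(self, direction, destination):
        self.portals[direction] = Portal(name=direction, destination=destination)


@dataclass(eq=False)
class Avatar(WorldObject):
    """The object representing a user in the world."""
    gender: str = "unspecified"
    friends: list = field(default_factory=list)
    room: "Room | None" = None
    user_info: dict = field(default_factory=dict)

    def enter(self, room):
        # Leave the current room (if any) and appear in the new one.
        if self.room is not None:
            self.room.contents.remove(self)
        self.room = room
        room.contents.append(self)

    def go(self, direction):
        # Follow a portal and return the descriptive text of the new room.
        portal = self.room.portals.get(direction)
        if portal is None:
            return f"There is no exit to the {direction}."
        self.enter(portal.destination)
        return self.room.description


hangar = Room("hangar", "An aircraft hangar with an open avionics bay.")
shop = Room("shop", "An avionics back shop with test benches.")
hangar.connect("north", shop)
trainee = Avatar("trainee")
trainee.enter(hangar)
print(trainee.go("north"))
```

In this sketch the room description returned by go plays the role of the descriptive text that users read to situate themselves in a location.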

An  animated  pedagogical  agent’s  environment  consists  of  the  learning 
environment,  the  user,  and  any  other  agents  in  the  learning  environment.  In  order  to 
understand  the  inner  workings  of  such  agents,  it  is  necessary  to  discuss  the  interface 
between  the  agent  and  the  pedagogical  environment.  The  interface  is  divided  into  two 
parts  consisting  of  the  agent’s  awareness  of  the  environment,  or  perception,  and  its  ability 
to  affect  the  environment,  or  motor  actions.  One  of  the  primary  motivations  for  animated 
pedagogical  agents  is  to  expand  the  bandwidth  of  human-computer  interaction,  thus  their 
perception  and  motor  actions  are  typically  more  diverse  than  previous  computer  tutors 
and learning companions (Johnson et al., 2000).

Most  animated  pedagogical  agents  track  the  state  of  the  problem  that  the  user  is 
addressing.  Other  agents  track  more  unusual  events  in  their  environment  such  as  speech 
events,  location  within  the  environment,  visual  attention,  gestures,  and  interaction 
between  an  agent’s  body  and  its  environment.  Current  advanced  research  may  focus  on 
other  features  such  as  a  user’s  facial  expressions  and  emotions.  These  advanced  and 
multiple states add to the already difficult task of maintaining a database that can address
and process the agents of a
virtual world. Interactions between an agent’s body and its environment require spatial
knowledge of the environment being used. This spatial ability is one of the key motivations for
animated pedagogical agents, which can look at objects, point to them,
demonstrate how to manipulate them, and navigate around them. This is important for any
future learning environment for both the teacher and the user. To date, relatively simple
representations of spatial knowledge have allowed animated pedagogical agents to
accomplish their needs. These simple representations include task bars and bitmap
images that allow the agents to quickly conduct activities such as locomotion (including
walking, sitting, and standing) as well as gestures and behaviors. However, these are just
the simple problems associated with navigating in a virtual world (Johnson et al., 2000).

Complex algorithms must be used to avoid collisions between users
and their environments, including other users. It is critical that control points, the
coordinates to be navigated, be interpolated in a manner that (1) enables the agent’s
movement to appear smooth and continuous and (2) guarantees that the path remains
collision-free. To accomplish this natural behavior, the navigation planner generates a
Bézier spline that interpolates the discretized path from the agent’s current location,
through each successive control point, to the target destination (Johnson et al., 2000).
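
The interpolation step can be sketched in Python as follows. The source does not say how the planner constructs its spline, so this sketch assumes one common construction: a piecewise cubic Bézier curve whose inner handles are derived from Catmull-Rom tangents, which makes the curve pass through every control point. The function names and the sample coordinates are hypothetical.

```python
def bezier_point(p0, p1, p2, p3, t):
    """Evaluate one cubic Bezier segment at parameter t in [0, 1]."""
    u = 1.0 - t
    return tuple(
        u**3 * a + 3 * u**2 * t * b + 3 * u * t**2 * c + t**3 * d
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def smooth_path(waypoints, samples_per_segment=8):
    """Sample a piecewise cubic Bezier curve that passes through all waypoints."""
    if len(waypoints) < 2:
        return list(waypoints)
    # Duplicate the end points so every segment has a neighbour on each side.
    pts = [waypoints[0]] + list(waypoints) + [waypoints[-1]]
    path = []
    for i in range(1, len(pts) - 2):
        prev, start, end, nxt = pts[i - 1], pts[i], pts[i + 1], pts[i + 2]
        # Catmull-Rom tangents converted to Bezier handles (assumed construction).
        h1 = tuple(s + (e - p) / 6.0 for p, s, e in zip(prev, start, end))
        h2 = tuple(e - (n - s) / 6.0 for s, e, n in zip(start, end, nxt))
        for k in range(samples_per_segment):
            path.append(bezier_point(start, h1, h2, end, k / samples_per_segment))
    # Close the path exactly on the target destination.
    path.append(tuple(float(c) for c in waypoints[-1]))
    return path


# Hypothetical control points: current location, two waypoints, destination.
control_points = [(0, 0), (1, 2), (3, 3), (4, 0)]
for point in smooth_path(control_points, samples_per_segment=4):
    print(point)
```

Smoothing of this kind addresses only the first requirement. The sampled points would still have to be checked against obstacles, since a curve drawn between collision-free control points can bulge into occupied space.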

To  affect  their  environment,  pedagogical  agents  require  a  vast  repertoire  of  motor 
actions.  These  actions  fall  into  speech,  body  control,  and  learning  environment  control. 
Typically,  speech  is  generated  as  a  text  string  to  allow  one  agent  to  speak  to  another  and 
may  be  displayed  as  text  or  processed  through  a  speech  synthesizer.  Body  control  may 
involve  playing  animation  clips  for  the  entire  body  or  specific  segments  for  gestures, 
expressions, and locomotion. Environment control may include changing background
music, environment background or scenery, and responses to object manipulation such as
the response from pushing a button. These actions form the behavioral building
blocks of the virtual world and learning environment (Johnson et al., 2000).
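
The three categories of motor action can be pictured as a small interface. The following Python sketch is illustrative only: the class and method names are hypothetical, and the environment is reduced to a log of events rather than a rendered world.

```python
from dataclasses import dataclass, field


@dataclass
class LearningEnvironment:
    """Hypothetical stand-in for the learning environment the agent acts on."""
    background_music: str = "none"
    events: list = field(default_factory=list)  # record of the agent's actions


class PedagogicalAgent:
    def __init__(self, environment):
        self.environment = environment

    def speak(self, text, synthesize=False):
        # Speech is a text string, shown as text or sent to a speech synthesizer.
        channel = "synthesized speech" if synthesize else "text"
        self.environment.events.append(("speech", channel, text))

    def animate(self, clip, body_part="whole body"):
        # Body control plays a clip for the entire body or a specific segment.
        self.environment.events.append(("body", body_part, clip))

    def set_music(self, track):
        # Environment control changes a property of the surrounding world.
        self.environment.background_music = track
        self.environment.events.append(("environment", "music", track))


env = LearningEnvironment()
agent = PedagogicalAgent(env)
agent.animate("point_at_panel", body_part="right arm")
agent.speak("Check the connector before applying power.")
agent.set_music("calm")
for event in env.events:
    print(event)
```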

In  order  to  design  the  behavior  of  an  agent,  the  building  blocks  of  an  agent’s 
behavior  must  be  generated  along  with  the  code  that  will  select  and  combine  the  right 
building  blocks  in  order  to  respond  correctly  to  the  situation.  The  most  common  method 
for  generating  behavior  of  a  pedagogical  agent  is  the  behavior  space  approach.  A 
behavior  space  is  a  library  of  behavior  fragments;  the  agent’s  behavior  is  generated  by  dynamically 
stringing  these  fragments  together  at  runtime.  Creating  these  behavioral  fragments 
can  range  from  the  simple  to  extremely  sophisticated  depending  on  the  desired  state  of 
animation.  In  order  to  select  the  appropriate  behavior  at  runtime,  each  behavior  fragment 
must  be  associated  with  additional  information  that  describes  its  content.  This  is 
accomplished  through  the  use  of  ontological,  intentional,  and  rhetorical  indexes. 
Ontological  indexes  are  used  for  explanatory  behaviors,  intentional  indexes  for  advisory 
behaviors,  and  rhetorical  indexes  for  audio  segments.  This  building  block  approach 
provides  very  high  quality  animations  but  is  limited  because  designers  must  anticipate  all 
of  the  behavior  segments  and  develop  complicated  rules  for  assembling  them 
(Johnson  et  al.,  2000). 
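
A minimal sketch of the behavior space idea follows, assuming a small in-memory library in which each fragment carries one index type (ontological, intentional, or rhetorical) and one key. The fragment names, keys, and durations are invented for illustration and do not come from any of the systems described above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Fragment:
    name: str        # stored animation or audio clip (hypothetical names)
    index_type: str  # "ontological", "intentional", or "rhetorical"
    key: str         # what the fragment is about
    seconds: float   # clip duration


# Hypothetical behavior space: every fragment is authored ahead of time.
BEHAVIOR_SPACE: List[Fragment] = [
    Fragment("point_at_fuse", "ontological", "fuse", 2.0),
    Fragment("explain_fuse", "ontological", "fuse", 5.5),
    Fragment("suggest_check_power", "intentional", "diagnose_power", 3.0),
    Fragment("audio_well_done", "rhetorical", "praise", 1.2),
]


def select(index_type: str, key: str) -> List[Fragment]:
    """Look up the stored fragments whose index matches the request."""
    return [f for f in BEHAVIOR_SPACE if f.index_type == index_type and f.key == key]


def assemble(requests: List[Tuple[str, str]]) -> List[Fragment]:
    """String matching fragments together, in request order, at runtime."""
    sequence: List[Fragment] = []
    for index_type, key in requests:
        matches = select(index_type, key)
        if not matches:
            # The limitation noted above: nothing was authored for this situation.
            raise LookupError(f"no stored fragment for {index_type}:{key}")
        sequence.extend(matches)
    return sequence
```

Requesting an explanation of the fuse followed by praise yields the two fuse fragments and then the audio clip. A request for something the designers did not anticipate raises an error, which mirrors why the stored approach needs a complete library and explicit assembly rules.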

Alternatives  to  the  behavior  space  approach  include  generating  behavior 
dynamically  as  well  as  controlling  behavior.  Flexibility  is  achieved  by  generating 
behavior  as  it  is  needed  without  using  any  previously  saved  animated  segments  or 
individual  frames,  as  is  the  case  in  the  behavior  space  method.  However,  generating 
behavior  on  the  spot  involves  tradeoffs  such  as  decreased  image  quality 
and  fragmented  speech  due  to  real-time  data  processing.  The  stored  images  and  sound 
clips  of  behavior  space  modeling  eliminate  the  quality  issue  but  constrain  the  agent  to 
delivering  only  the  behaviors  stored  in  a  database.  Behavior  control  includes  the  use  of 
pedagogical  context,  task  context,  and  dialogue  context.  Respectively,  these  comprise  the 
instructional  goals  and  a  model  of  the  student’s  knowledge,  the  state  of  the  user’s  and 
agent’s  problem  solving,  and  the  state  of  the  collaborative  interaction  between  the  user 
and  the  agent  (Johnson  et  al.,  2000).  Ideally,  combining  stored  data  from  the 
behavior  space  with  new  behavior  generated  on  the  spot  through  dynamic  behavior  or 
behavior  control  would  work  much  better  than  either  approach  alone.  The  main  obstacle  is 
controlling  the  behavior  of  an  animated  agent  so  that  its  nonverbal  actions  are  synchronized  with 
its  verbal  utterances.  Overall,  the  primary  goal  is  for  the  interactive  environment  to  be 
believable.  By  creating  the  illusion  of  life,  dynamically  animated  agents  have  the 
potential  to  increase  the  amount  of  time  users  spend  with  educational  software  (Johnson 
et  al.,  2000).  Moreover,  technological  advances  have  made  graphics  hardware 
more  affordable  and  the  widespread  distribution  of  real-time  animation  technology  a 
reality. 

One  of  the  current  virtual  worlds  that  has  addressed  several  of  the 
aforementioned  technical  difficulties  is  the  Internet-based  Second  Life  program  that  was 
launched  in  2003  (Sege,  2006).  Second  Life  was  developed  by  Linden  Research,  Inc.,  and 
enables  users  to  interact  within  a  social  network  in  a  virtual  environment.  Avatars  are 
used  to  identify  users  and  communication  is  made  available  through  local  chat  and  instant 
messaging.  Locomotion  is  available  in  the  forms  of  walking,  jumping,  flying,  as  well  as 
riding  in  vehicles  and  teleportation  to  a  specific  location.  Current  technical  obstacles 
include  the  database  engine  freezing  under  severe  user  load.  These 
overloads  may  cause  objects  to  disappear  unexpectedly,  searches  to  fail,  or  total  system 
shut  down.  Linden  addressed  these  issues  along  with  a  primary  problem  known  as  the 
Deep  Think  condition  with  its  Havok  4  physics  engine.  The  Deep  Think  problem  is 
encountered  when  two  physical  objects  intersect  one  another  in  the  virtual  world  and  the 
engine  does  not  know  how  to  separate  the  objects.  Thus,  the  program  goes  into  a 
recursive  loop  trying  to  analyze  the  overlapping  objects  and  eventually  consumes  all 
available  CPU  power,  reducing  the  simulator  to  a  crawl.  Linden  addressed  the 
physics-based  problem  by  creating  an  overlap  ejection  capability  that  allows  overlapped  objects 
to  separate  and  propel  apart,  much  like  two  springs  compressed  against  each  other.  To 
address  the  scripting  problems  and  help  make  the  user  experience  more  pleasant,  Linden 
has  developed  the  LSL  scripting  language,  which  is  similar  to  C.  Objects  are  built  from 
a  combination  of  primitive  shapes,  or  “prims,”  which  are 
sized,  scaled,  and  stretched  as  needed,  with  images,  called  “textures,”  applied  to  the 
surface  of  the  prim  to  alter  its  appearance.  LSL  determines  how  the  objects  behave, 
controlling  the  waving  of  virtual  hair,  the  driving  of  a  vehicle,  or  other  animated  objects. 
Everything  in  Second  Life  is  composed  of  some  combination  of  prims,  textures,  and  LSL 
scripts  (Greenemeier,  2005). 
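
The spring analogy behind overlap ejection can be sketched as follows. This is an illustrative model only, assuming two unit-mass spheres and a Hooke's-law push proportional to penetration depth; it is not Linden's or Havok's actual algorithm, and the names and the stiffness and time-step values are hypothetical. The point is that a single bounded correction replaces open-ended analysis of the overlapping objects.

```python
import math
from dataclasses import dataclass


@dataclass
class Sphere:
    x: float
    y: float
    z: float
    radius: float
    vx: float = 0.0
    vy: float = 0.0
    vz: float = 0.0


def eject_overlap(a: Sphere, b: Sphere, stiffness: float = 50.0, dt: float = 0.02) -> float:
    """Separate two overlapping spheres in one step; return the penetration depth."""
    dx, dy, dz = b.x - a.x, b.y - a.y, b.z - a.z
    dist = math.sqrt(dx * dx + dy * dy + dz * dz)
    depth = a.radius + b.radius - dist
    if depth <= 0.0:
        return 0.0  # not overlapping, nothing to do
    if dist == 0.0:
        nx, ny, nz = 1.0, 0.0, 0.0  # coincident centers: pick an arbitrary axis
    else:
        nx, ny, nz = dx / dist, dy / dist, dz / dist
    # Positional correction: move each sphere half the penetration depth apart.
    half = depth / 2.0
    a.x, a.y, a.z = a.x - nx * half, a.y - ny * half, a.z - nz * half
    b.x, b.y, b.z = b.x + nx * half, b.y + ny * half, b.z + nz * half
    # Spring-like push: velocity change proportional to depth (unit mass assumed).
    dv = stiffness * depth * dt
    a.vx, a.vy, a.vz = a.vx - nx * dv, a.vy - ny * dv, a.vz - nz * dv
    b.vx, b.vy, b.vz = b.vx + nx * dv, b.vy + ny * dv, b.vz + nz * dv
    return depth
```

Because the correction is computed directly from the penetration depth, the cost per overlapping pair is constant, which is the property that avoids the runaway CPU use described for the Deep Think condition.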

Currently,  Second  Life  is  innovating  Internet  use  and  is  clearly  in  the  early  stages 
of  the  design  process.  However,  usage  grew  from  18,000  users  in  December  2006  to  36,000 
users  in  March  2007,  and  Second  Life  is  currently  estimated  to  have  over  1  million  users.  To  handle 
such  an  enormous  computational  burden,  Second  Life  runs  on  2,000  Intel  processors  in 
two  co-located  facilities  in  San  Francisco  and  Dallas  where  the  servers  support  over  34 
terabytes  of  user-created  content  (Wagner,  2007).  Aside  from  the  virtual  social  network, 
Linden  is  exploring  the  virtual  education  environment.  Second  Life  has  recently  become 
one  of  the  cutting  edge  virtual  classrooms  for  major  colleges  and  universities. 

Universities  such  as  Harvard,  Princeton,  and  Vassar  have  already  begun  teaching  using 
Second  Life  as  a  3D  computer  program  to  host  interactive  learning.  Universities  are  able 
to  buy  land  within  Second  Life  for  educational  purposes  and  block  access  from  outsiders. 
This  capability  has  eliminated  a  cost  burden  for  developing  new  training  programs  and 
tools  for  education.  Likewise,  the  interactive  and  virtual  environment  catches  the  user’s 
or  student’s  attention  and  makes  learning  fun.  Selective  areas  chosen  for  lessons  allow  for 
advanced  learning  such  as  philosophy  seminars  in  a  Japanese  Zen  garden  (Parker,  2007). 

However,  Second  Life  is  not  without  its  problems.  Since  the  virtual  world  is 
created  through  a  programming  language,  there  are  security  risks.  One  area  of  concern  for 
parents  is  online  stalking  and  sexual  predators  that  may  be  lurking  in  the  virtual  world. 
Also,  advanced  users  may  exploit  holes  in  the  source  code  to  alter  other 
users’  avatars  and  activities,  such  as  by  disabling  their  movement  (Lagorio,  2007).  In  addition, 
malicious  users  may  disrupt  user  environments  by  destroying  scenery  or  simply  leaving 
thousands  of  items  on  the  ground,  a  form  of  cyber  littering  that  all  who  enter  the 
environment  will  see.  It  is  clear  that  Linden  and  Second  Life  are  in  the  early  stages  of  the 
design  process.  No  dominant  design  has  emerged  and  current  development  is  at  the  low 
end  of  Foster’s  S-curve  (Utterback,  1996).  Due  to  the  lack  of  a  dominant  design,  there 
exists  the  potential  for  a  competitor  to  enter  the  virtual  world  against  Linden  and  disrupt 
its  current  technological  advancement  of  the  Internet  along  with  the  virtual  education 
market. 

Way  ahead 

As  the  technical  challenges  become  settled  and  a  dominant  design  emerges,  it 
would  seem  reasonable  that  the  application  of  animated  pedagogical  agents  and  their 
associated  interactive  learning  environments  would  only  increase.  In  fact,  Air  Education 
and  Training  Command  (AETC),  in  its  white  paper  On  Learning:  The  Future  of  Air  Force 
Education  and  Training  (2008),  has  expressed  interest  in  using  avatars  in  virtual  Air 
Force  bases  for  training,  similar  to  Linden’s  Second  Life.  In  light  of  this 
expansion  of  various  training  applications,  empirical  studies  probing  into  the  technology’s 
effectiveness  for  Air  Force  training  are  critical  in  assessing  return  on  investment.  What 
follows  is  a  deliberate  method  of  research  that  the  Logistics  Readiness  Branch  of  the  Air 
Force  Research  Laboratory’s  Human  Effectiveness  Directorate  is  currently  executing 
through  the  Maintenance  Aiding  and  Training  Technology  Experimental  Research 
(MATTER)  effort. 

Under  the  Advanced  Visual  and  Instruction  Systems  for  Maintenance  Support 
(AVIS-MS)  research  project,  the  utility  of  augmented  reality  was  investigated  for  the 
presentation  of  procedural  instructions  during  maintenance  tasks  and  the  acquisition  of 
task  data  from  expert  task  performance  on  a  real  device.  One  of  the  lessons  learned 
during  this  research  is  that  obtaining  human  performance  information  from  tasks 
involving  full  scale  physical  props  remains  a  challenge  for  common  motion  capture 
technologies.  These  limitations  have  inspired  further  research  of  a  new  untethered  motion 
capture  technology  that  has  matured  to  a  point  that  is  viable  for  study. 

The  Untethered  Motion  Capture  Evaluation  for  Flightline  Maintenance  research 
project  is  exploring  and  evaluating  the  utility  of  novel  motion  capture  technologies  within 
the  Air  Force  maintenance  domain.  A  primary  objective  is  determining  the  potential  of 
untethered  motion  capture  capabilities  for  real-time  human  subject  motion  capture  and 
performance  data  collection  with  full  scale  physical  props.  Data  will  be  collected  and 
evaluated  during  a  maintenance  task  scenario  for  the  purpose  of  instruction  generation 
and  maintenance  training.  The  effort  consists  of  a  domain  analysis,  a  conceptual  design 
and  architecture  definition,  a  prototype  development,  and  a  performance  evaluation  of 
motion  and  task  matching  algorithms  within  relevant  operational  maintenance  scenarios. 
Laboratory  and  field  research  will  be  used  to  evaluate  the  efficacy  of  untethered  motion 
capture  for  obtaining  human  performance  data  from  tasks  involving  full  scale  physical 
props  and  the  reuse  of  this  information  within  augmented  procedural  instruction. 

The  Technical  Instruction  Multimedia  Extensions  for  Training  (TIMET)  research 
project  will  investigate  the  selective  addition  of  multimedia  content,  especially 
segmented  video  and  3D  animated  virtual  maintainers,  to  existing  technical  instruction 
presentations  as  a  training  aid.  The  research  will  review  available  resources  concerning 
animated  virtual  maintainers,  identify  suitable  maintenance  tasks  that  can  be  visualized 
through  virtual  human  interfaces,  prototype  a  virtual  human  interface  within  existing 
electronic  technical  instructions  as  rendered  on  a  portable  electronic  device,  develop 
semantic  annotation  of  Computer  Aided  Design  (CAD)  models  (manual  labeling)  for  part 
reference  during  subject  performance  monitoring,  and  investigate  preliminary  coaching 
behaviors  based  on  evaluation  of  subject  motion  capture  while  on  task.  The  computer 
generated  virtual  humans  will  interface  with  CAD  models  of  the  selected  maintained 
systems  in  such  a  fashion  that  design  updates  to  the  CAD  models  may  automatically 
generate  plausible  new  animations  for  the  virtual  maintainer. 

The  Virtual  Coaching  Agent  for  Team  Training  (VCATT)  research  project  will 
investigate  an  interactive,  computer-generated,  human  agent  to  guide  a  subject  or  team  of 
subjects  during  a  maintenance  task  training  scenario  and  measure,  evaluate,  and  verify 
trainee  actions  against  an  established  expert  model  of  task  performance.  The  coach 
will  process  training  task  performance  data  from  the  trainees  and  establish  when  the 
performance  appears  to  deviate  from  the  current  task  goal.  Any  deviation  will  invoke 
explicit  behavior  by  the  coach  to  take  appropriate  action,  such  as  reiterating  the  current  goal, 
reminding  the  trainees  about  overlooked  cautions  and  warnings,  or  demonstrating  the  correct 
task  behavior.  This  research  will  extend  previous  work  under  the  TIMET  project  by 
evaluating  interaction  techniques  between  coach  and  subject  as  well  as  multiple  subjects 
working  together  on  a  team.  Integration  of  wider  knowledge  sources  into  the  coach 
concept  such  as  function,  physics,  and  geometry  will  yield  suitable  system-level 
understanding  of  the  maintenance  task.  A  field  evaluation  of  the  coaching  system  in  an 
actual  training  environment  will  measure  performance  impacts  by  various  team  members 
while  performing  training  procedures.  Preliminary  data  will  be  collected  on  the 
effectiveness  of  the  coach  in  reducing  errors  and  increasing  training  effectiveness  as  well  as 
on  its  limitations  for  training  applications. 
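
The coach's decision logic described above can be sketched in a few lines. This is a hypothetical illustration, not the VCATT design: the expert model is reduced to one expected action and a set of required cautions per step, and the escalation rule (reiterate the goal on a first deviation, demonstrate on a repeat) is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Iterable, Optional


class CoachAction(Enum):
    REITERATE_GOAL = "reiterate the current goal"
    REMIND_CAUTION = "remind about overlooked cautions and warnings"
    DEMONSTRATE = "demonstrate the correct task behavior"


@dataclass(frozen=True)
class ExpertStep:
    goal: str                                # what the trainee should achieve
    expected_action: str                     # action in the expert model
    cautions: FrozenSet[str] = frozenset()   # cautions that must be acknowledged


def coach_response(
    step: ExpertStep,
    observed_action: str,
    acknowledged: Iterable[str],
    prior_deviations: int = 0,
) -> Optional[CoachAction]:
    """Compare trainee behavior with the expert step and pick a coaching action."""
    overlooked = step.cautions - set(acknowledged)
    if overlooked:
        return CoachAction.REMIND_CAUTION
    if observed_action == step.expected_action:
        return None  # performance matches the expert model, no intervention
    # Assumed escalation: restate the goal first, demonstrate on repeated deviation.
    if prior_deviations >= 1:
        return CoachAction.DEMONSTRATE
    return CoachAction.REITERATE_GOAL
```

In a team setting, the same check could be run per trainee against the step assigned to that team member. How deviations are actually detected from motion capture data is outside the scope of this sketch.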

The  Anticipating  Future  Job  Aiding/Training  Requirements  Workshop  effort  will 
evaluate  overall  Air  Force  maintenance  training  needs  and  costs  versus  future  trends  in 
technology  development.  The  overall  objective  is  to  assist  the  Air  Force  Research 
Laboratory  in  identifying  and  defining  research  vectors/thrusts  related  to  future  tools, 
technologies,  and  techniques  for  job  aiding  and  training  10  to  15  years  out.  A  small 
multidisciplinary  team  of  experts  and  visionaries  from  academic  and  professional  fields 
related  to  job  aiding  and  training  (e.g.,  educational  technology,  computer  science,  and 
social  sciences)  has  been  assembled.  These  experts  will  come 
together  and  participate  as  part  of  a  small  working  group  that  will  help  AFRL  in  defining 
job  aiding  and  training  research  vectors/thrusts  to  address  future  Air  Force  operational 
requirements.  The  formal  product  from  this  research  task  is  a  technical  report  that 
addresses  the  following: 

a.  A  survey  of  the  future  workforce  in  general,  and  Air  Force  specifically  (e.g. 
demographics,  skills  and  education,  etc.) 

b.  An  assessment  of  the  future  work  environment  in  general,  and  Air  Force 
specifically  (e.g.  nature  of  work,  training  requirements,  etc.) 

c.  An  assessment  of  current  and  future  technologies  and  trends  focused  on 
improving  job  aiding  and  training  (e.g.  distributed  learning,  simulation,  etc.) 

d.  A  critical  and  unbiased  assessment  of  the  technology  gaps  and  training 
shortfalls  with  respect  to  the  above  survey  and  assessments 

e.  A  discussion  of  research  streams  or  vectors  which  could  be  pursued  in  the 
future  to  address  the  identified  technology  gaps  and  training  shortfalls 

Summary 

This  paper  provided  a  brief  overview  of  the  need  for  more  innovative  ways  of 
delivering  technical  training  to  support  the  Air  Force  vision  for  a  lighter,  leaner,  agile 
force.  The  paper  explored  animated  pedagogical  agents  within  interactive  learning 
environments  as  a  technology  that  could  be  exploited  for  portable  training  augmentation 
for  the  schoolhouse  as  well  as  just-in-time  training  for  operations  in  the  field.  Several 
virtual  training  applications  used  by  the  Department  of  Defense  and 
commercial  industry  were  discussed.  While  the  applications  were  varied,  the  end  goals  of 
the  training  applications  were  similar:  reduced  costs,  reduced  time,  improved 
communications,  wide  distribution,  and  increased  appeal  to  a  younger  generation  of 
workers.  Technical  challenges  were  discussed,  highlighting  advantages  and  disadvantages 
of  two  approaches,  stored  and  dynamic,  used  to  program  behavior  of  the  agents.  The 
approach  used  will  ultimately  depend  on  customer  requirements  for  the  application.  It 
was  noted  that  a  dominant  design  has  not  emerged  as  most  applications  are  proprietary 
and  vertically  integrated  for  performance.  As  the  technological  improvements  exceed 
customer  requirements,  responsiveness  and  convenience  will  become  the  new  design 
criteria,  especially  as  the  demand  for  custom  training  applications  increases.  We  will  see 
more  interoperability  and  modular  designs  when  this  happens  (Christensen  &  Raynor, 
2003).  Finally,  a  deliberate  research  method  was  laid  out  in  an  attempt  to  build  up 
empirical  knowledge  on  the  technology’s  effectiveness  and  limitations  for  use  in  Air  Force 
training.  This  innovative  technology  shows  great  potential  for  future  training.  To  what 
degree  is  still  to  be  determined. 